53 research outputs found

    Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach

    This paper investigates a critical resource allocation problem in the first-party cloud: scheduling containers to machines. There are tens of services and each service runs a set of homogeneous containers with dynamic resource usage; containers of a service are scheduled daily in a batch fashion. This problem can be naturally formulated as a Stochastic Bin Packing Problem (SBPP). However, traditional SBPP research often focuses on cases of empty machines, whose objective, i.e., to minimize the number of used machines, is not well-defined for the more common reality with nonempty machines. This paper aims to close this gap. First, we define a new objective metric, Used Capacity at Confidence (UCaC), which measures the maximum used resources at a probability and is proved to be consistent for both empty and nonempty machines, and reformulate the SBPP under chance constraints. Second, by modeling the container resource usage distribution in a generative approach, we reveal that UCaC can be approximated with a Gaussian distribution, which is verified by trace data of real-world applications. Third, we propose an exact solver by solving the equivalent cutting stock variant as well as two heuristics-based solvers: UCaC best fit and bi-level heuristics. We experimentally evaluate these solvers on both synthetic datasets and real application traces, demonstrating our methodology's advantage over the traditional optimal SBPP solver that minimizes the number of used machines, with a low rate of resource violations. Comment: To appear in SIGKDD 2022 as Research Track paper
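
    The abstract above does not give the formula for UCaC or the details of the UCaC best-fit heuristic, so the following is only a minimal sketch of one plausible reading. It assumes container usages are independent with known mean and variance, so that a machine's total usage is approximately Gaussian and its UCaC at confidence alpha is the alpha-quantile of that total. The names `ucac` and `best_fit`, the `[mean_sum, var_sum]` machine state, and the placement rule (smallest increase in UCaC) are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def ucac(mean_sum, var_sum, alpha=0.99):
    """Alpha-quantile of a machine's total usage under a Gaussian assumption."""
    z = NormalDist().inv_cdf(alpha)
    return mean_sum + z * sqrt(var_sum)


def best_fit(containers, machines, capacity, alpha=0.99):
    """Greedy placement: put each container where UCaC grows the least.

    containers: list of (mean, variance) pairs (assumed independent usages).
    machines:   list of [mean_sum, var_sum] lists; nonempty machines are allowed.
    Returns the chosen machine index for each container, in input order.
    """
    placement = []
    for mu, var in containers:
        best, best_delta = None, None
        for i, (m, v) in enumerate(machines):
            new = ucac(m + mu, v + var, alpha)
            if new > capacity:
                continue  # chance constraint would be violated on this machine
            delta = new - ucac(m, v, alpha)
            if best_delta is None or delta < best_delta:
                best, best_delta = i, delta
        if best is None:
            raise ValueError("no machine can host this container")
        machines[best][0] += mu
        machines[best][1] += var
        placement.append(best)
    return placement
```

    Because standard deviations add sub-linearly, this rule tends to prefer machines that already carry variance, which is one way a quantile-based objective can remain meaningful when machines are not empty.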

    XNET: A Real-Time Unified Secure Inference Framework Using Homomorphic Encryption

    Homomorphic Encryption (HE) presents a promising solution to securing neural networks for Machine Learning as a Service (MLaaS). Despite its potential, the real-time applicability of current HE-based solutions remains a challenge, and the diversity in network structures often results in inefficient implementations and maintenance. To address these issues, we introduce a unified and compact network structure for real-time inference in convolutional neural networks based on HE. We further propose several optimization strategies, including an innovative compression and encoding technique and rearrangement in the pixel encoding sequence, enabling a highly efficient batched computation and reducing the demand for time-consuming HE operations. To further expedite computation, we propose a GPU acceleration engine that leverages massive thread-level parallelism. We test our framework with the MNIST, Fashion-MNIST, and CIFAR-10 datasets, demonstrating accuracies of 99.14%, 90.8%, and 61.09%, respectively. Furthermore, our framework maintains a steady processing speed of 0.46 seconds on a single-thread CPU, and a brisk 31.862 milliseconds on an A100 GPU for all datasets. This represents a speedup of more than 3000 times over previous work, paving the way for future explorations in the realm of secure and real-time machine learning applications

    Leveraging GPU in Homomorphic Encryption: Framework Design and Analysis of BFV Variants

    Homomorphic Encryption (HE) enhances data security by facilitating computations on encrypted data, opening new paths for privacy-focused computations. The Brakerski-Fan-Vercauteren (BFV) scheme, a promising HE scheme, raises considerable performance challenges. Graphics Processing Units (GPUs), with considerable parallel processing abilities, have emerged as an effective solution. In this work, we present an in-depth study focusing on accelerating and comparing BFV variants on GPUs, including Bajard-Eynard-Hasan-Zucca (BEHZ), Halevi-Polyakov-Shoup (HPS), and other recent variants. We introduce a universal framework accommodating all variants, propose an optimized BEHZ implementation, and are the first to support HPS variants with large parameter sets on GPUs. Moreover, we devise several optimizations for both low-level arithmetic and high-level operations, including minimizing instructions for modular operations, enhancing hardware utilization for base conversion, implementing efficient reuse strategies, and introducing intra-arithmetic and inner-conversion fusion methods, thus decreasing the overall computational and memory consumption. Leveraging our framework, we offer comprehensive comparative analyses. Our performance evaluation showcases a marked speed improvement, achieving 31.9× over OpenFHE running on a multi-threaded CPU and 39.7% and 29.9% improvement, respectively, over the state-of-the-art GPU BEHZ implementation. Our implementation of the leveled HPS variant records up to 4× speedup over other variants, positioning it as a highly promising alternative for specific applications

    Implementing and Benchmarking Word-Wise Homomorphic Encryption Schemes on GPU

    Homomorphic encryption (HE) is one of the most promising techniques for privacy-preserving computations, especially the word-wise HE schemes that allow batched computations over ciphertexts. However, the high computational overhead hinders the deployment of HE in real-world applications. GPUs are often used to accelerate execution in such scenarios, but a comparison of different HE schemes on the same GPU platform is still lacking. In this work, we implement three word-wise HE schemes, BGV, BFV, and CKKS, on GPU, with both theoretical and engineering optimizations. We optimize the hybrid key-switching technique, reducing the computational and memory overhead of this procedure. We explore several kernel fusing strategies to reuse data, which reduces memory access and IO latency and improves overall performance. By comparing with state-of-the-art works, we demonstrate the effectiveness of our implementation. Meanwhile, we present a framework that finely integrates our implementation of the three schemes, covering almost all scheme functions and homomorphic operations. We optimize the management of pre-computation, RNS bases and memory in the framework, to provide efficient and low-latency data access and transfer. Based on this framework, we provide a thorough benchmark of the three schemes, which can serve as a reference for scheme selection and implementation in constructing privacy-preserving applications
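
    For readers unfamiliar with the RNS bases mentioned above, the following is a minimal sketch of the residue number system idea: an integer modulo a large product Q is stored as small residues modulo pairwise coprime moduli, arithmetic is done independently per residue, and the result is recovered with the Chinese Remainder Theorem. The toy moduli and the helper names `to_rns`, `rns_mul`, and `from_rns` are illustrative assumptions, not part of the paper's framework, and real HE libraries use much larger word-sized primes.

```python
from math import prod

# Toy pairwise coprime moduli (an assumption for illustration only).
MODULI = [97, 101, 103]


def to_rns(x, moduli=MODULI):
    """Represent x by its residues modulo each base element."""
    return [x % q for q in moduli]


def rns_mul(a, b, moduli=MODULI):
    """Multiply two RNS values; each residue channel is independent."""
    return [(x * y) % q for x, y, q in zip(a, b, moduli)]


def from_rns(residues, moduli=MODULI):
    """Recover the integer modulo prod(moduli) via the Chinese Remainder Theorem."""
    big_q = prod(moduli)
    total = 0
    for r, q in zip(residues, moduli):
        q_hat = big_q // q            # product of all the other moduli
        inv = pow(q_hat, -1, q)       # modular inverse (Python 3.8+)
        total += r * q_hat * inv
    return total % big_q
```

    The per-residue independence is what makes this representation attractive for parallel hardware: each channel can be processed by separate threads without carries between them.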

    The Healing Process of Intracorporeally and In Situ Devitalized Distal Femur by Microwave in a Dog Model and Its Mechanical Properties In Vitro

    Background: Limb-salvage surgery has been well recognized as a standard treatment and alternative to amputation for patients with malignant bone tumors. Various limb-sparing techniques have been developed including tumor prosthesis, allograft, autograft and graft-prosthesis composite. However, each of these methods has short- and long-term disadvantages such as nonunion, mechanical failures and poor limb function. The technique of intracorporeal devitalization of tumor-bearing bone segment in situ by microwave-induced hyperthermia after separating it from surrounding normal tissues with a safe margin is a promising limb-salvage method, which may avoid some shortcomings encountered by the above-mentioned conventional techniques. The purpose of this study is to assess the healing process and revitalization potential of the devitalized bone segment by this method in a dog model. In addition, the immediate effect of microwave on the biomechanical properties of bone tissue was also explored in an in vitro experiment. Methods: We applied the microwave-induced hyperthermia to devitalize the distal femurs of dogs in situ. Using a monopole microwave antenna, we could produce a necrotic bone of nearly 20 mm in length in distal femur. Radiography, bone scintigraphy, microangiography, histology and functional evaluation were performed at 2 weeks and 1, 2, 3, 6, 9 and 12 months postoperatively to assess the healing process. In a biomechanical study, two kinds of bone specimens, 3 and 6 cm in length, were used for compression and three-point bending test respectively immediately after extracorporeall

    Gold Nanoparticles Increase PLK1-Specific Small Interfering RNA Transfection and Induce Apoptosis of Drug Resistance Breast Cancer Cells

    Drug resistance is a major barrier that limits the effectiveness of chemotherapies against breast cancer. Here, gold nanoparticles (GNPs) characterized by good dispersivity, high stability, low cytotoxicity, and simple synthesis were developed to deliver small interfering RNA (siRNA) against PLK1 (PLK1-siRNA) and overcome the drug resistance of breast cancer cells. Compared with the commonly used Lipofectamine 2000, GNPs showed higher PLK1-siRNA delivery efficiency and resulted in remarkable gene silencing of PLK1 in drug-resistant breast cancer cells MCF-7/MDR1 with low cytotoxicity in vitro. Moreover, delivery of PLK1-siRNA by GNPs caused 14.23% apoptosis of MCF-7/MDR1 cells, which was higher than the 11.01% apoptosis induced by Lipofectamine 2000. In addition, GNPs showed a strong X-ray attenuation coefficient, indicating the potential theranostic application of this system. Therefore, this study represents an important step in the use of GNPs as a transfection vector for siRNA, which will be of great benefit to gene therapy against drug-resistant cancer

    Single-Crystalline Si1−xGex (x = 0.5~1) Thin Films on Si (001) with Low Threading Dislocation Density Prepared by Low Temperature Molecular Beam Epitaxy

    Single-crystalline Si1−xGex thin films on Si (100) with low threading dislocation density (TDD) are highly desired in the semiconductor industry. It is challenging to suppress the TDD since there is a large lattice mismatch (4.2%) between Ge and Si: strain relaxation typically requires a TDD of 10⁶–10⁷/cm², which could, however, cause device leakage under high voltage. Here, we grew Si1−xGex (x = 0.5–1) films on Si (001) by low temperature molecular beam epitaxy (LT-MBE) at 200 °C, which is much lower than the typical temperature of 450–600 °C. Encouragingly, the Si1−xGex thin films grown by LT-MBE have shown a dramatically reduced TDD, down to the 10³–10⁴/cm² level. Using transmission electron microscopy (TEM) with atomic resolution, we discovered a non-typical strain relaxation mechanism for epitaxial films grown by LT-MBE. Multiple-layered structures are introduced along the out-of-plane direction during film growth, effectively relaxing the large strain through local shearing and subsequently leading to an order of magnitude lower TDD. We present a model for the non-typical strain relaxation mechanism for Si1−xGex films grown on Si (001) by LT-MBE

    Relationship between C2 slope with sagittal parameters and clinical function of degenerative cervical kyphosis

    Purpose: To explore the relationship between C2 slope (C2S) and sagittal parameters and clinical function in degenerative cervical kyphosis (DCK). Methods: A retrospective analysis was performed of 127 patients with degenerative cervical spondylosis treated in our spinal deformity center from January 2019 to June 2022. Patients were categorized into two groups and compared based on C2-7 angle (C2-7 ≥ 5° as kyphosis group, C2-7 < 5° as lordosis group). Pearson correlation or Spearman correlation was used to analyze the relationship between C2S and conventional radiological parameters and health-related quality-of-life (HRQOL) outcomes as measured by the EuroQol 5 dimension questionnaire (EQ5D), NRS, and the neck disability index (NDI). The cutoff value of C2S was determined by a receiver operating characteristic (ROC) curve. Results: There were 127 patients who met inclusion criteria (79 men and 48 women). The mean age was 56.00 ± 10.27 years (range 31–81 years). C2S was higher in the kyphosis group than in the non-kyphosis group. cSVA increased as cervical kyphosis worsened. For all patients, C2S demonstrated a significant correlation with the O-C2 angle, C2-7 angle, cSVA, and TS-CL (p < 0.05). NRS, NDI and EQ5D-VAS scores revealed a significant correlation with C2S and cSVA (p < 0.01). For the subgroup of patients presenting with DCK, ROC curves gave C2S cutoff values of 26.3° and 30.5° for a cSVA of 40 mm and for severe disability as expressed by NDI, respectively. Conclusion: On the basis of retaining the consistency of cranio-cervical and cervico-thoracic structure, C2S can better analyze the sagittal alignment of DCK patients than TS-CL and has good practicability in clinical application and HRQOL evaluation